207 research outputs found

Selenoprofiles: profile-based scanning of eukaryotic genome sequences for selenoprotein genes

Author: Birney
Burset
Cassago
Castellano
Copeland
Driscoll
Gromer
Grundner-Culemann
Harrow
Hatfield
Jiang
Kryukov
Li
Lobanov
Lobanov
M. Mariotti
Milinkovitch
Notredame
Novoselov
Novoselov
R. Guigo
Slater
Xu
Publication venue: Oxford University Press
Publication date: 01/01/2010

Motivation: Selenoproteins are a group of proteins that contain selenocysteine (Sec), a rare amino acid inserted co-translationally into the protein chain. The Sec codon is UGA, which is normally a stop codon. In selenoproteins, UGA is recoded to Sec in the presence of specific features on selenoprotein gene transcripts. Due to the dual role of the UGA codon, selenoprotein prediction and annotation are difficult tasks, and even known selenoproteins are often misannotated in genome databases

Crossref

PubMed Central

UPF Digital Repository

Gene finding in the chicken genome

Author: Antonarakis Stylianos E
Birney Ewan
Brent Michael R
Bye Jacqueline M
Camara Francisco
Castelo Robert
Eyras Eduardo
Flicek Paul
Guigo Roderic
Huckle Elizabeth J
Parra Genis
Reymond Alexandre
Rogers Jane
Shteynberg David D
Wyss Carine
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: Despite the continuous production of genome sequence for a number of organisms, reliable, comprehensive, and cost effective gene prediction remains problematic. This is particularly true for genomes for which there is not a large collection of known gene sequences, such as the recently published chicken genome. We used the chicken sequence to test comparative and homology-based gene-finding methods followed by experimental validation as an effective genome annotation method. RESULTS: We performed experimental evaluation by RT-PCR of three different computational gene finders, Ensembl, SGP2 and TWINSCAN, applied to the chicken genome. A Venn diagram was computed and each component of it was evaluated. The results showed that de novo comparative methods can identify up to about 700 chicken genes with no previous evidence of expression, and can correctly extend about 40% of homology-based predictions at the 5' end. CONCLUSIONS: De novo comparative gene prediction followed by experimental verification is effective at enhancing the annotation of the newly sequenced genomes provided by standard homology-based methods

Springer - Publisher Connector

Serveur académique lausannois

Directory of Open Access Journals

PubMed Central

Digital Commons@Becker

UPF Digital Repository

Archive ouverte UNIGE

Definition of the Gene Content of the Human Genome: The Need for Deep Experimental Verification

Author: Aach
Altschul
Anamaria A. Camargo
Andrew J. G. Simpson
Batzoglou
Borsu
Clarevie
Clayton
Crollius
de Souza
Delcher
Dias-Neto
Dunham
Ewing
Fraser
Guigo
Hattori
Herzog
Kaufmann
Kawai
Lander
Levinson
Li
Liang
Maglott
Memes
Miyajima
Ricardo R. Brentani
Rother
Sandro J. de Souza
Valleix
Vanhee-Brossolet
Venter
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2001

Based on the analysis of the drafts of the human genome sequence, it is being speculated that our species may possess an unexpectedly low number of genes. The quality of the drafts, the impossibility of accurate gene prediction and the lack of sufficient transcript sequence data, however, render such speculations very premature. The complexity of human gene structure requires additional and extensive experimental verification of transcripts that may result in major revisions of these early estimates of the number of human genes

Crossref

Directory of Open Access Journals

PubMed Central

Using ESTs to improve the accuracy of de novo gene prediction

Author: A Krogh
AA Salamov
AC Siepel
C Wei
Chaochun Wei
DR Maglott
E Birney
I Korf
JE Allen
JE Allen
KD Pruitt
KD Pruitt
KL Howe
L Stein
LW Hillier
M Stanke
MG Reese
Michael R Brent
MJ van Baren
MR Brent
MS Boguski
P Flicek
R Guigo
R Guigó
R Mott
RA Gibbs
RH Brown
RH Waterston
S Foissac
SS Gross
The MGC Project Team
TW Harris
TW Harris
VV Solovyev
WJ Kent
Publication venue: BioMed Central
Publication date: 01/07/2006

BACKGROUND: ESTs are a tremendous resource for determining the exon-intron structures of genes, but even extensive EST sequencing tends to leave many exons and genes untouched. Gene prediction systems based exclusively on EST alignments miss these exons and genes, leading to poor sensitivity. De novo gene prediction systems, which ignore ESTs in favor of genomic sequence, can predict such "untouched" exons, but they are less accurate when predicting exons to which ESTs align. TWINSCAN is the most accurate de novo gene finder available for nematodes and N-SCAN is the most accurate for mammals, as measured by exact CDS gene prediction and exact exon prediction. RESULTS: TWINSCAN_EST is a new system that successfully combines EST alignments with TWINSCAN. On the whole C. elegans genome TWINSCAN_EST shows 14% improvement in sensitivity and 13% in specificity in predicting exact gene structures compared to TWINSCAN without EST alignments. Not only are the structures revealed by EST alignments predicted correctly, but these also constrain the predictions without alignments, improving their accuracy. For the human genome, we used the same approach with N-SCAN, creating N-SCAN_EST. On the whole genome, N-SCAN_EST produced a 6% improvement in sensitivity and 1% in specificity of exact gene structure predictions compared to N-SCAN. CONCLUSION: TWINSCAN_EST and N-SCAN_EST are more accurate than TWINSCAN and N-SCAN, while retaining their ability to discover novel genes to which no ESTs align. Thus, we recommend using the EST versions of these programs to annotate any genome for which EST information is available. TWINSCAN_EST and N-SCAN_EST are part of the TWINSCAN open source software package

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Defining functional DNA elements in the human genome

Author: Bernstein B. E.
Birney E.
Crawford G. E.
Dekker J.
Dunham I.
Elnitski L. L.
Farnham P. J.
Feingold E. A.
Gerstein M.
Giddings M. C.
Gilbert D. M.
Gingeras T. R.
Green E. D.
Guigo R.
Hardison R. C.
Hubbard T.
Kellis M.
Kent J.
Kundaje A.
Lieb J. D.
Marinov G. K.
Myers R. M.
Pazin M. J.
Ren B.
Snyder M. P.
Stamatoyannopoulos J. A.
Ward L. D.
Weng Z. P.
White K. P.
Wold B.
Publication venue: Proceedings of the National Academy of Sciences
Publication date: 01/04/2014

With the completion of the human genome sequence, attention turned to identifying and annotating its functional DNA elements. As a complement to genetic and comparative genomics approaches, the Encyclopedia of DNA Elements Project was launched to contribute maps of RNA transcripts, transcriptional regulator binding sites, and chromatin states in many cell types. The resulting genome-wide data reveal sites of biochemical activity with high positional resolution and cell type specificity that facilitate studies of gene regulation and interpretation of noncoding variants associated with human disease. However, the biochemically active regions cover a much larger fraction of the genome than do evolutionarily conserved regions, raising the question of whether nonconserved but biochemically active regions are truly functional. Here, we review the strengths and limitations of biochemical, evolutionary, and genetic approaches for defining functional DNA segments, potential sources for the observed differences in estimated genomic coverage, and the biological implications of these discrepancies. We also analyze the relationship between signal intensity, genomic coverage, and evolutionary conservation. Our results reinforce the principle that each approach provides complementary information and that we need to use combinations of all three to elucidate genome function in human biology and disease

Cold Spring Harbor Laboratory Institutional Repository

Features generated for computational splice-site prediction correspond to functional elements

Author: A Goren
AJ McCullough
AJ McCullough
AL Blum
C Gooding
C Mathe
D Koller
G Kol
G Yeo
GE Crooks
H Liu
J Královicová
K Chua
K Han
KK Nelson
L Cartegni
L Mariño-Ramírez
Lise Getoor
LP Lim
LR Coulter
M Pertea
M Pertea
MB Stadler
ML Hastings
R Guigo
R Islamaj
R Islamaj Dogan
R Kohavi
R Singh
Rezarta Islamaj Dogan
S Degroeve
S Degroeve
Stephen M Mount
T Zhang
W John Wilbur
WG Fairbrother
XH Zhang
XH Zhang
XH Zhang
Y Yang
Z Wang
ZM Zheng
Publication venue: BioMed Central
Publication date: 01/10/2007

Background: Accurate selection of splice sites during the splicing of precursors to messenger RNA requires both relatively well-characterized signals at the splice sites and auxiliary signals in the adjacent exons and introns. We previously described a feature generation algorithm (FGA) that is capable of achieving high classification accuracy on human 3' splice sites. In this paper, we extend the splice-site prediction to 5' splice sites and explore the generated features for biologically meaningful splicing signals. Results: We present examples from the observed features that correspond to known signals, both core signals (including the branch site and pyrimidine tract) and auxiliary signals (including GGG triplets and exon splicing enhancers). We present evidence that features identified by FGA include splicing signals not found by other methods. Conclusion: Our generated features capture known biological signals in the expected sequence interval flanking splice sites. The method can be easily applied to other species and to similar classification problems, such as tissue-specific regulatory elements, polyadenylation sites, promoters, etc.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Digital Repository at the University of Maryland

The Origins, Evolution, and Functional Potential of Alternative Splicing in Vertebrates

Author: A. Frankish
A. Reymond
Altschul
Amit
Boguski
C. Howald
Cazalla
Cheah
Chen
Graveley
Hansen
J. Fernandez-Banet
J. Harrow
J. M. Mudge
Jekosch
Jurka
Kan
Kim
Kim
Koren
Langmead
Lareau
Lu
McGuire
Mendell
Modrek
Mortazavi
Nilsen
Ohoka
Pan
Pickrell
R. Guigo
Schwartz
Sela
Sela
Simpson
Slater
Sorek
Sorek
Sorek
Sprague
Stolc
Sureau
T. Alioto
T. Derrien
T. Hubbard
Wang
Wang
Publication venue: Oxford University Press
Publication date: 01/01/2011

Alternative splicing (AS) has the potential to greatly expand the functional repertoire of mammalian transcriptomes. However, few variant transcripts have been characterized functionally, making it difficult to assess the contribution of AS to the generation of phenotypic complexity and to study the evolution of splicing patterns. We have compared the AS of 309 protein-coding genes in the human ENCODE pilot regions against their mouse orthologs in unprecedented detail, utilizing traditional transcriptomic and RNAseq data. The conservation status of every transcript has been investigated, and each functionally categorized as coding (separated into coding sequence [CDS] or nonsense-mediated decay [NMD] linked) or noncoding. In total, 36.7% of human and 19.3% of mouse coding transcripts are species specific, and we observe a 3.6 times excess of human NMD transcripts compared with mouse; in contrast to previous studies, the majority of species-specific AS is unlinked to transposable elements. We observe one conserved CDS variant and one conserved NMD variant per 2.3 and 11.4 genes, respectively. Subsequently, we identify and characterize equivalent AS patterns for 22.9% of these CDS or NMD-linked events in nonmammalian vertebrate genomes, and our data indicate that functional NMD-linked AS is more widespread and ancient than previously thought. Furthermore, although we observe an association between conserved AS and elevated sequence conservation, as previously reported, we emphasize that 30% of conserved AS exons display sequence conservation below the average score for constitutive exons. In conclusion, we demonstrate the value of detailed comparative annotation in generating a comprehensive set of AS transcripts, increasing our understanding of AS evolution in vertebrates. Our data supports a model whereby the acquisition of functional AS has occurred throughout vertebrate evolution and is considered alongside amino acid change as a key mechanism in gene evolution

CiteSeerX

Crossref

PubMed Central

UPF Digital Repository

King's Research Portal

Perivascular spaces are associated with tau pathophysiology and synaptic dysfunction in early Alzheimer’s continuum

Author: Arenaza-Urquijo EM
Beteta A
Blennow K
Brugulat A
Cacciaglia R
Cañas A
Ciampa I
Crous-Bou M
Cumplido I
Deulofeu C
Dominguez R
Emilio M
Falcón C
Fauria K
Fuentes S
Gispert JD
Grau-Rivera O
Guigo R
Hernandez L
Huesa G
Huguet J
Kollmorgen G
Marne P
Menchón T
Milà-Alomà M
Minguillon C
Molinuevo JL
Operto G
Polo A
Pradas S
Rodriguez-Fernandez B
Sala-Vila A
Salvadó G
Shekari M
Soteras A
Suárez-Calvet M
Sánchez-Benavides G
Vilanova M
Vilor-Tejedor N
Zetterberg H
Publication date: 05/08/2021

Background: Perivascular spaces (PVS) have an important role in the elimination of metabolic waste from the brain. It has been hypothesized that the enlargement of PVS (ePVS) could be affected by pathophysiological mechanisms involved in Alzheimer’s disease (AD), such as abnormal levels of CSF biomarkers. However, the relationship between ePVS and these pathophysiological mechanisms remains unknown. Objective: We aimed to investigate the association between ePVS and CSF biomarkers of several pathophysiological mechanisms for AD. We hypothesized that ePVS would be associated with CSF biomarkers early in the AD continuum (i.e., amyloid positive cognitively unimpaired individuals). We also explored associations between ePVS and demographic and cardiovascular risk factors. Methods: The study included 322 middle-aged cognitively unimpaired participants from the ALFA+ study, many within the Alzheimer’s continuum. NeuroToolKit and Elecsys® immunoassays were used to measure CSF Aβ42, Aβ40, p-tau and t-tau, NfL, neurogranin, TREM2, YKL40, GFAP, IL6, S100, and α-synuclein. PVS in the basal ganglia (BG) and centrum semiovale (CS) were assessed based on a validated 4-point visual rating scale. Odds ratios were calculated for associations of cardiovascular and AD risk factors with ePVS using logistic and multinomial models adjusted for relevant confounders. Models were stratified by Aβ status (positivity defined as Aβ42/40 < 0.071). Results: The degree of PVS significantly increased with age in both the BG and CS regions, independently of cardiovascular risk factors. Higher levels of p-tau, t-tau, and neurogranin were significantly associated with ePVS in the CS of Aβ positive individuals, after accounting for relevant confounders. No associations were detected in the BG, nor in Aβ negative participants. Conclusions: Our results support that ePVS in the CS are specifically associated with tau pathophysiology, neurodegeneration, and synaptic dysfunction in asymptomatic stages of the Alzheimer’s continuum

UCL Discovery

Modelling Reveals Kinetic Advantages of Co-Transcriptional Splicing

Author: A Audibert
A Shatkin
AR Kornblihtt
B Rutz
CC Query
CK Mapendano
CW Pikielny
DA Brow
DF Tardiff
DJ Smith
F Brabant
F Carrillo Oesterreich
F Rigo
H Akaike
H Kimura
I Listerman
J Rino
J Rino
J Schaber
J Singh
Jean D. Beggs
JM Pedraza
K Struhl
KM Neugebauer
KP Burnham
L Liu
M Dundr
M Voliotis
MC Wahl
MJ Hicks
N Proudfoot
O Kessler
OR Gonzalez
PJ Hilleren
R Alexander
R Alexander
R Perales
R Reed
RD Alexander
RF Luco
Roderic Guigo
Ross D. Alexander
S Aitken
S Boireau
S Ramsey
Stuart Aitken
U Schmidt
V Iyer
X Darzacq
X Darzacq
Y Shav-Tal
Publication venue: Public Library of Science
Publication date: 01/01/2011

Messenger RNA splicing is an essential and complex process for the removal of intron sequences. Whereas the composition of the splicing machinery is mostly known, the kinetics of splicing, the catalytic activity of splicing factors and the interdependency of transcription, splicing and mRNA 3′ end formation are less well understood. We propose a stochastic model of splicing kinetics that explains data obtained from high-resolution kinetic analyses of transcription, splicing and 3′ end formation during induction of an intron-containing reporter gene in budding yeast. Modelling reveals co-transcriptional splicing to be the most probable and most efficient splicing pathway for the reporter transcripts, due in part to a positive feedback mechanism for co-transcriptional second step splicing. Model comparison is used to assess the alternative representations of reactions. Modelling also indicates the functional coupling of transcription and splicing, because both the rate of initiation of transcription and the probability that step one of splicing occurs co-transcriptionally are reduced, when the second step of splicing is abolished in a mutant reporter

Public Library of Science (PLOS)

CiteSeerX

Crossref

Heriot Watt Pure

Directory of Open Access Journals

PubMed Central

Edinburgh Research Explorer

Multiple Chromosomal Rearrangements Structured the Ancestral Vertebrate Hox-Bearing Protochromosomes

Author: A Martin
A McLysaght
A Meyer
AH Neidert
AL Evans
AL Hufton
AL Hughes
AL Hughes
AL Hughes
AL Hughes
BP Chowdhary
BR Holland
C Kappen
C Popovici
D Larhammar
DE Ferrier
DW Stock
F van der Hoeven
GAT McVean
GP Wagner
Günter P. Wagner
H Kishino
H Shimodaira
H Shimodaira
H Shimodaira
J Bergsten
J Kim
J Spring
J Zhang
JP Huelsenbeck
KD Crow
LG Lundin
M Anisimova
M Holder
M Kohn
M Sémon
N Goldman
O Pontes
P Dehal
R Friedman
R Friedman
R Furlong
R Guigo
R Phillips
RC Edgar
RC Edgar
S Guindon
S Ohno
SG Gregory
T Keane
T Marques-Bonet
Takashi Gojobori
V Lynch
Vincent J. Lynch
WJ Bailey
WJ Murphy
X Gu
Y Nakatani
Y Wang
Publication venue: Public Library of Science
Publication date: 01/01/2009

While the proposal that large-scale genome expansions occurred early in vertebrate evolution is widely accepted, the exact mechanism of the expansion (such as a single or multiple rounds of whole genome duplication, bloc chromosome duplications, large-scale individual gene duplications, or some combination of these) is unclear. Gene families with a single invertebrate member but four vertebrate members, such as the Hox clusters, provided early support for Ohno's hypothesis that two rounds of genome duplication (the 2R-model) occurred in the stem lineage of extant vertebrates. However, despite extensive study, the duplication history of the Hox clusters has remained unclear, calling into question its usefulness in resolving the role of large-scale gene or genome duplications in early vertebrates. Here, we present a phylogenetic analysis of the vertebrate Hox clusters and several linked genes (the Hox “paralogon”) and show that different phylogenies are obtained for Dlx and Col genes than for Hox and ErbB genes. We show that these results are robust to errors in phylogenetic inference and suggest that these competing phylogenies can be resolved if two chromosomal crossover events occurred in the ancestral vertebrate. These results resolve conflicting data on the order of Hox gene duplications and the role of genome duplication in vertebrate evolution and suggest that a period of genome reorganization occurred after genome duplications in early vertebrates

Crossref

Directory of Open Access Journals

PubMed Central